Are Random Forests Truly the Best Classifiers?

Authors

Michael Wainberg

Babak Alipanahi

Brendan J. Frey

Abstract

The JMLR study “Do we need hundreds of classifiers to solve real world classification problems?” benchmarks 179 classifiers in 17 families on 121 data sets from the UCI repository and claims that “the random forest is clearly the best family of classifier”. In this response, we show that the study’s results are biased by the lack of a held-out test set and the exclusion of trials with errors. Further, the study’s own statistical tests indicate that random forests do not have significantly higher percent accuracy than support vector machines and neural networks, calling into question the conclusion that random forests are the best classifiers.
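
The effect of a missing held-out test set can be illustrated with a toy simulation. This is a sketch, not the authors' analysis: it assumes the bias comes from picking the best of several configurations on the same data used to report accuracy. Every "configuration" here guesses at random, so its true accuracy is 50%. The function name `simulate_selection_bias` and all parameter values are hypothetical.

```python
import random


def simulate_selection_bias(n_configs=50, n_samples=100, n_trials=200, seed=0):
    """Compare the accuracy reported on the selection data with held-out accuracy.

    Each configuration is a pure-chance classifier (true accuracy 0.5).
    Returns (mean accuracy of the selected configuration on the selection
    data, mean accuracy of that same configuration on held-out data).
    """
    rng = random.Random(seed)

    def chance_accuracy():
        # Fraction of n_samples predictions that happen to be correct.
        return sum(rng.random() < 0.5 for _ in range(n_samples)) / n_samples

    selected_total = 0.0
    heldout_total = 0.0
    for _ in range(n_trials):
        # One (selection-set accuracy, held-out accuracy) pair per configuration.
        scores = [(chance_accuracy(), chance_accuracy()) for _ in range(n_configs)]
        # Choose the configuration that looks best on the selection data.
        best_selected, best_heldout = max(scores, key=lambda pair: pair[0])
        selected_total += best_selected
        heldout_total += best_heldout
    return selected_total / n_trials, heldout_total / n_trials


if __name__ == "__main__":
    selected, heldout = simulate_selection_bias()
    print(f"accuracy on data used for selection: {selected:.3f}")
    print(f"accuracy on held-out data:           {heldout:.3f}")
```

Under these assumptions, the accuracy measured on the selection data is inflated above chance, while the held-out accuracy of the same configuration stays near 50%. The gap grows as more configurations are compared.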

Similar resources

Random Forests of Binary Hierarchical Classifiers for Analysis of Hyperspectral Data

Statistical classification of hyperspectral data is challenging because the input space is high in dimension and correlated, but labeled information to characterize the class distributions is typically sparse. The resulting classifiers are often unstable and have poor generalization. A new approach that is based on the concept of random forests of classifiers and implemented within a multiclass...

Comparison of Machine Learning Algorithms for Broad Leaf Species Classification Using UAV-RGB Images

Knowing the tree species combination of forests provides valuable information for studying the forest’s economic value, fire risk assessment, biodiversity monitoring, and wildlife habitat improvement. Fieldwork is often time-consuming and labor-intensive, free satellite data are available in coarse resolution, and the use of manned aircraft is relatively costly. Recently, unmanned aeria...

Effective Classifiers for Detecting Objects

Several state-of-the-art machine learning classifiers are compared for the purposes of object detection in complex images, using global image features derived from the Ohta color space and Local Binary Patterns. Image complexity in this sense refers to the degree to which the target objects are occluded and/or nondominant (i.e. not in the foreground) in the image, and also the degree to which t...

Improvement of Chemical Named Entity Recognition through Sentence-based Random Under-sampling and Classifier Combination

Chemical Named Entity Recognition (NER) is the basic step for subsequent information extraction tasks such as named entity resolution, drug-drug interaction discovery, and extraction of the names of molecules and their properties. Improvement in the performance of such systems may affect the quality of the subsequent tasks. Chemical text from which data for named entity recognition is extracte...

Measuring the Algorithmic Convergence of Random Forests via Bootstrap Extrapolation

When making predictions with a voting rule, a basic question arises: “What is the smallest number of votes needed to make a good prediction?” In the context of ensemble classifiers, such as Random Forests or Bagging, this question represents a tradeoff between computational cost and statistical performance. Namely, by paying a larger computational price for more classifiers, the prediction erro...

Journal:

Journal of Machine Learning Research

Volume 17

Publication date: 2016
